How to use the DataFrame_Name['column_name'].plot() function in Python Pandas
We can create a box plot of any column of a Pandas DataFrame using the following syntax:
DataFrame_Name['column_name'].plot(kind='box', title='title_of_plot')
Note: The first quartile, the median, and the third quartile can be computed with the quantile() method.
Syntax to find quartiles
data.quantile([0.25,0.5,0.75])
- 0.25 indicates the first quartile.
- 0.5 indicates the median value.
- 0.75 indicates the third quartile.
Example: finding the quartiles of a data set
Python3
    # import necessary packages
    import pandas as pd

    data = pd.Series([1, 2, 3, 4, 5, 6])

    # find quartile values
    print(data.quantile([0.25, 0.5, 0.75]))
Output
    0.25    2.25
    0.50    3.50
    0.75    4.75
    dtype: float64
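As a complement to quantile(), the describe() method reports the same three quartiles together with the minimum and maximum, which are the five values a box plot is built from. The snippet below is a minimal sketch on the same series; the variable name `summary` is only illustrative.

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6])

# describe() returns count, mean, std, min, 25%, 50%, 75% and max
summary = data.describe()

# keep only the five values that a box plot is built from
print(summary[["min", "25%", "50%", "75%", "max"]])
```

The labels "25%", "50%", and "75%" correspond to the 0.25, 0.5, and 0.75 quantiles shown above.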
Consider the following data, which we will load into a DataFrame and draw as a box plot.
Name | Marks | Credits
---|---|---
Akhil | 77 | 8
Nikhil | 95 | 10
Satyam | 89 | 9
Sravan | 78 | 8
Pavan | 64 | 7
Example:
Create a DataFrame from the above data and draw a box plot of the Marks column. Because this data has no outliers, the bottom line (the end of the lower whisker) marks the minimum mark and the top line (the end of the upper whisker) marks the maximum mark. Between them, the three horizontal lines of the box mark the first quartile, the median, and the third quartile, from bottom to top.
Python3
    # import necessary packages
    import pandas as pd
    import matplotlib.pyplot as plt

    # create a dataframe
    data = pd.DataFrame({'Name': ['Akhil', 'Nikhil', 'Satyam', 'Sravan', 'Pavan'],
                         'Marks': [77, 95, 89, 78, 64],
                         'Credits': [8, 10, 9, 8, 7]})

    # box plot
    data['Marks'].plot(kind='box', title='Marks of students')
    plt.show()
Output: a box plot of the Marks column, titled "Marks of students".
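To check the five lines of the plot against actual numbers, one option is `matplotlib.cbook.boxplot_stats`, which computes the box and whisker positions without drawing anything. The sketch below assumes matplotlib's default whisker setting (1.5 times the interquartile range); the variable names are illustrative.

```python
import pandas as pd
from matplotlib import cbook

marks = pd.Series([77, 95, 89, 78, 64], name="Marks")

# boxplot_stats returns one dict per data set; we pass a single column
stats = cbook.boxplot_stats(marks.to_numpy())[0]

print("lower whisker :", stats["whislo"])
print("first quartile:", stats["q1"])
print("median        :", stats["med"])
print("third quartile:", stats["q3"])
print("upper whisker :", stats["whishi"])
print("outliers      :", list(stats["fliers"]))
```

For this data the whiskers land on the smallest and largest marks (64 and 95), and the list of outliers is empty.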
Example:
In this example, the minimum mark is 10, which is far below the other marks (data points). It is therefore drawn as a small circle (o) at the bottom of the plot, which represents an outlier. Any data point that is much larger or smaller than the rest is drawn this way: by default, matplotlib treats points that lie more than 1.5 times the interquartile range beyond the box as outliers.
Python3
    # import necessary packages
    import pandas as pd
    import matplotlib.pyplot as plt

    # create a dataframe
    data = pd.DataFrame({'Name': ['Akhil', 'Nikhil', 'Satyam', 'Sravan', 'Pavan'],
                         'Marks': [77, 95, 89, 78, 10],
                         'Credits': [8, 10, 9, 8, 0]})

    # outlier box plot
    data['Marks'].plot(kind='box', title='Marks of students')
    plt.show()
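Output: a box plot of the Marks column in which the mark of 10 is drawn as a separate circle below the box.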
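Whether a point will be drawn as an outlier can be checked by hand. The following is a minimal sketch, assuming matplotlib's default whisker length of 1.5 times the interquartile range (IQR); the names `lower_fence` and `upper_fence` are illustrative and not part of pandas.

```python
import pandas as pd

marks = pd.Series([77, 95, 89, 78, 10], name="Marks")

# quartiles and interquartile range
q1 = marks.quantile(0.25)
q3 = marks.quantile(0.75)
iqr = q3 - q1

# points outside these fences are drawn as outliers (assuming whis=1.5)
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = marks[(marks < lower_fence) | (marks > upper_fence)]

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers.tolist())
```

Here Q1 is 77 and Q3 is 89, so the fences are 59 and 107, and only the mark of 10 falls outside them.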
How to Create a Boxplot from a Pandas DataFrame?
A box plot, also called a box-and-whisker plot, summarizes a set of data using its minimum, first quartile, median, third quartile, and maximum. Pandas draws box plots through the matplotlib library. When a column is plotted with plot(kind='box'), the x-axis shows the name of the column being plotted and the y-axis shows the data values.
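The same plot(kind='box') call also works on a selection of several columns, which draws one box per column on a shared axis. The snippet below is a sketch under a few assumptions: it uses the Marks and Credits data from the table above, switches to the non-interactive "Agg" backend so it runs without a display, and writes to a hypothetical file name `marks_credits_boxplot.png`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove to view interactively
import pandas as pd

df = pd.DataFrame({
    "Name": ["Akhil", "Nikhil", "Satyam", "Sravan", "Pavan"],
    "Marks": [77, 95, 89, 78, 64],
    "Credits": [8, 10, 9, 8, 7],
})

# select only the numeric columns; each one becomes a box on the x-axis
ax = df[["Marks", "Credits"]].plot(kind="box", title="Marks and credits of students")
ax.set_ylabel("Value")

# hypothetical output file name, used only for illustration
ax.figure.savefig("marks_credits_boxplot.png")
```

Because Marks and Credits are on very different scales, the Credits box looks compressed on a shared axis. Passing subplots=True to plot() is one way to give each column its own axis.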